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ABSTRACT 

We analyze 8 years of precise radial velocity measurements from the Keck Planet Search, characterizing the 
detection threshold, selection effects, and completeness of the survey. We first carry out a systematic search for 
planets, by assessing the false alarm probability associated with Keplerian orbit fits to the data. This allows us 
to understand the detection threshold for each star in terms of the number and time baseline of the observations, 
and the underlying "noise" from measurement errors, intrinsic stellar jitter, or additional low mass planets. We 
show that all planets with orbital periods P < 2000 days, velocity amplitudes K > 20 m s~\ and eccentricities 
e < 0.6 have been announced, and we summarize the candidates at lower amplitudes and longer orbital periods. 
For the remaining stars, we calculate upper limits on the velocity amplitude of a companion. For orbital periods 
less than the duration of the observations, these are typically 10 m s"', and increase cx for longer periods. 
We then use the non-detections to derive completeness corrections at low amplitudes and long orbital periods, 
and discuss the resulting distribution of minimum mass and orbital period. We give the fraction of stars with 
a planet as a function of planet mass and orbital period, and extrapolate to long period orbits and low planet 
masses. A power law fit for planet masses > 0.3 Mj and periods < 2000 days gives a mass-period distribution 
dN = CM'^P'^dlnMdlnP with a = -0.31 ± 0.2, /3 = 0.26 ±0.1, and the normaHzation constant C such that 
10.5% of solar type stars have a planet with mass in the range 0.3-10 Mj and orbital period 2-2000 days. 
The orbital period distribution shows an increase in the planet fraction by a factor of w 5 for orbital periods 
> 300 days. Extrapolation gives 17-20% of stars having gas giant planets within 20 AU. Finally, we constrain 
the occurrence rate of planets orbiting M dwarfs compared to FGK dwarfs, taking into account differences in 
detectability. 

Subject headings: binaries: spectroscopic — methods: statistical — planetary systems 
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1. INTRODUCTION 

Precise Doppler velocity surveys of nearby stars have led to 
the detection of more than 250 extrasolar planets (e.g. Marcy 
et al. 2005a; Butler et al. 2006a). They have minimum masses 
from 5 Earth masses (M^) and up, orbital periods from close 
to one day up to several years, and a wide range of eccen- 
tricities. Over 25 multiple planet systems are known, with 
many other single planet systems showing a long term ve- 
locity trend likely indicating a second planet with long or- 
bital period (Fischer et al. 2001). The increasing number of 
detections allows us to answer questions about the statistical 
properties of extrasolar planetary systems, such as the mass, 
period, and eccentricity distributions (Tabachnik & Tremaine 
2002; Butler et al. 2003; Fischer et al. 2003; Lineweaver & 
Grether 2003; Jones et al. 2003; Udry, Mayor, & Santos 2003; 
Gaudi, Seager, & Mallen-Ornelas 2005; Ford & Rasio 2006; 
Jones et al. 2006; Ribas & Miralda-Escude 2007), and the in- 
cidence of giant planets as a function of host star metallicity 
(Fischer & Valenti 2005; Santos et al. 2005) and mass (Butler 
et al. 2004b; Butler et al. 2006b; Endl et al. 2006; Johnson et 
al. 2007). 

In this paper, we focus on the frequency of planetary sys- 
tems, and the distributions of mass and orbital periods. The 
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frequency of planets is important for future astrometric and 
direct searches (see e.g. Beuzit et al. 2007). The details of the 
mass-orbital period distribution are important because they 
contain information about the planet formation process (Ar- 
mitage et al. 2002; Ida & Lin 2004a, 2004b, 2005, 2008a, 
2008b; Alibert et al. 2005; Rice & Armitage 2005; Kornet 
& Wolf 2006). Figure 1 shows the distribution of planet 
masses and orbital periods for 182 planets announced as of 
March 2007. Several features of the mass-period distribution 
have been discussed in the literature: the "pile-up" at orbital 
periods of w 3 days (the "hot jupiters") (e.g. see Gaudi et 
al. 2005); the paucity of massive planets (M > 1 Mj) in close 
orbits (Udry et al. 2002, 2003; Mazeh & Zucker 2002, 2003); 
the deficit of planets at intermediate orbital periods of 10 
and 100 days, giving a "period valley" in the orbital period 
distribution (Jones et al. 2003; Udry et al. 2003; Burkert & 
Ida 2007); and a suggestion that the lack of lower mass planets 
(M < Q.15Mj) at orbital periods 100-2000 days is signifi- 
cant and a real feature of the distribution (Udry et al. 2003). It 
has also been pointed out that the incidence and mass-period 
distribution of planets should depend on the mass of the host 
star In particular, a much lower incidence of Jupiter mass 
planets is expected around M dwarfs in the core accretion sce- 
nario for planet formation (Laughlin, Bodenheimer, & Adams 
2004; Ida & Lin 2005; Kennedy & Kenyon 2008 although 
see Kornet & Wolf 2006), and observational estimates sup- 
port this picture (Butler et al. 2004b; Butler et al. 2006b; Endl 
et al. 2006; Johnson et al. 2007). 

These interpretations are complicated by the fact that the 
mass-period distribution is subject to selection effects at low 
masses and long orbital periods. The important observa- 
tional quantity is the stellar velocity amplitude induced by the 
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Fig. 1. — Minimum mass (Msini) and period (P) distribution of 182 ex- 
trasolar planets detected by radial velocity searches announced as of March 
2007. The dotted lines show velocity ampUtudes of 3, 10 and 30 m s"' for a 
IMq star. We take the orbital parameters from the updated Butler et al. 2006 
catalog maintained at http://www.exoplanets.org. 



planet, which for a planet of mass Mp is 



K = 



28.4 m/s /Mpsini 
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where P is the orbital period, e is the eccentricity, is the 
mass of the star, and i is the incUnation of the orbit. The 
dotted lines in Figure 1 show A" = 3, 10 and 30 m s"^ for cir- 
cular orbits around a solar mass star. The detectable ampli- 
tude depends on the number and duration of the observations, 
and particularly on the Doppler measurement errors and other 
noise sources (see Cununing 2004 for a detailed discussion). 

Scatter in the measured radial velocity is expected from sta- 
tistical and systematic measurement errors, and intrinsic stel- 
lar radial velocity variations or "jitter". The typical statis- 
tical measurement error, which we refer to in this paper as 
the "Doppler error" or "Doppler measurement error", is de- 
termined by the uncertainty in the mean velocity of a large 
number of spectral segments. The Doppler errors are typi- 
cally 3-5 m/s in the data considered here, although Doppler 
errors as small as w 1 m/s are now routine (Mayor et al. 2003; 
Butler et al. 2004a). Sources of systematic measurement er- 
rors include imperfect PSF descriptions and deconvolution al- 
gorithms, and the characterization of the charge transfer in 
the spectrometer CCD (see discussion in Butler et al. 2006b). 
Stellar jitter is thought to arise from a combination of surface 
convective motions, magnetic activity, and rotation (Saar & 
Donahue 1997). The amount of jitter depends on stellar prop- 
erties such as rotation rate and spectral type, but is typically 
1-5 m/s for chromospherically quiet stars (Saar, Butler, & 
Marcy 1998; Santos et al. 2000; Wright 2005). Additional 
low mass planets in a system could provide another source of 
radial velocity variability. 

These various sources of noise determine the velocity 
threshold for detecting planets, and vary between observa- 
tions, different stars, and different surveys. Interpretation of 
the mass-period distribution at low masses requires a care- 
ful analysis of these selection effects. Most work to date 
has taken a fixed detection threshold, such as K = 10 ms"' 
(e.g. Udry et al. 2003) or a mass cut well above the masses 
at which selection effects should play a role (Lineweaver 



& Grether 2003). A few detailed calculations of detection 
thresholds have been carried out. In Cumming, Marcy, & But- 
ler (1999), we presented an analysis of 11 years of Doppler 
measurements of 76 stars as part of the Lick planet search. 
However, the conclusions regarding the mass-period distribu- 
tion were limited because only 6 planets were then known. 
Endl et al. (2002) present a statistical analysis of the 37 star 
sample observed by the ESO Coude Echelle spectrometer. 
Wittenmyer et al. (2006) present limits on companion mass 
for 3 1 stars observed at McDonald Observatory. The largest 
study so far is that of Naef et al. (2005), who derive detection 
thresholds for 330 stars from the ELODIE Planet Search and 
estimate planet occurrence rates. 

In this paper, we analyse 8 years of radial velocity measure- 
ments from the Keck survey, consisting of data taken from 
the beginning of the survey in 1996 to the time of the HIRES 
spectrometer upgrade in 2004 August. The number of stars 
(585) and planets (48) included in the sample offer an or- 
der of magnitude improvement over our previous Cumming et 
al. (1999) analysis, and therefore the best opportunity to date 
to determine the occurrence rate of planets and their mass- 
period distribution. An outline of the paper is as follows. In 
§2, we describe a technique for identifying planets in radial 
velocity data, discuss the detection thresholds for the survey, 
and calculate limits on the mass and period of planets orbit- 
ing stars that do not have a signilicant detection. In §3, we 
use these limits to correct the mass and period distributions 
for incompleteness, and then characterize the occurrence rate 
of planets and the mass-period distribution. We sunomarize 
and conclude in §4. 

2. SEARCH FOR PLANETS 

2.1. Observations 

The Keck Planet Search program has been in operation 
since 1996 July, using the HIRES echelle spectrometer on the 
Keck I telescope (Vogt et al. 1994; Vogt et al. 2000). The data 
we analyze here were taken from the beginning of the survey 
up to 2004 August when the HIRES spectrometer was up- 
graded. They consist of radial velocity measurements of 585 
F, G, K, and M stars (the fractions in these spectral classes 
are 7, 49, 27, and 16% respectively). Note that the M dwarf 
sample covers spectral types M5 and earlier; the F stars are 
of spectral type F5 and later. Selection of the target stars is 
described in Wright et al. (2004) and Marcy et al. (2005b). 
They lie close to the main sequence and are chromospheri- 
cally quiet. They have B-V > 0.55, declination > -35°, and 
have no stellar companion within 2". To ensure enough data 
points for an adequate Keplerian fit to the data, we further se- 
lect only those stars with at least 10 observations over a period 
of two years or more. This excludes data for an additional 360 
stars that were added in the two years prior to 2004 August. 
Figure 2 is a summary of the number, time baseline or dura- 
tion, and mean rate of observations. Typical values are 10-30 
observations in total over a duration of 6-8 years, with 3 ob- 
servations per year. The target list of stars has changed over 
the years of the survey, with stars being dropped and added 
(see Marcy et al. 2005b for a discussion of the evolution of 
the target list), resulting in the spread in durations shown in 
Figure 2. 

Figure 3 is a summary of the statistical Doppler measure- 
ment errors and the estimated jitter from Wright (2005). The 
Doppler error shown is the mean Doppler error for all obser- 
vations for each star. These are statistical errors determined 
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Fig. 2. — The number, duration, and average rate of observations, for the 
585 stars in the sample (all of which have 10 or more observations over a 
period of two years or more). The shaded histograms are for the subset of 
stars with M* < 0.5Mq. The dotted histograms are for the subset of 48 stars 
with an announced planet, and have been multiplied by a factor of 10 for 
clarity. 



by the weighted uncertainty in the mean velocity of 400 spec- 
tral segments, each 2A long. Most stars have a mean Doppler 
measurement error of « 3 m/s (smaller errors of w 1 m s~' 
have been achieved at Keck following the 2004 HIRES up- 
grade, but these data are not included in this analysis). The 
stars in the sample have been selected to be chromospheri- 
cally inactive, but still show stellar jitter at the few m s" level 
(Saar et al. 1998). We adopt the jitter estimates described in 
Marcy et al. (2005b) and Wright (2005), in which the level 
of jitter is calibrated in terms of stellar properties, in partic- 
ular B-V, My, and Rhk- These values include contributions 
from both intrinsic stellar jitter and systematic measurement 
errors. The typical jitter is (Tjitter ~ 3 m/s for a chromospher- 
ically quiet star, with a tail of larger values for more active 
stars. This is comparable to the Doppler measurement errors, 
indicating that the velocity measurements have reached a pre- 
cision for which stellar jitter and systematic errors are begin- 
ning to dominate the uncertainty. 
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Fig. 3. — The mean Doppler velocity measurement error, and estimated 
jitter, for the stars in the sample. The jitter estimates are taken from Wright 
(2005) and include both intrinsic stellar jitter and systematic measurement 
errors. The shaded histograms are for the subset of stars with Af* < 0.5Mq . 

In Figures 2 and 3 the dotted histograms summarize the ob- 
servations of the 1 10 stars with stellar masses < 0.5 Mq, 
which are almost entirely M dwarfs. In §3, we analyze 
the mass-orbital period distribution separately for stars with 
Mi, > O.SMq and < O.5M0, since the planet occurrence 
rate for M dwarfs appears to be smaller than that for FGK 
stars. Figures 2 and 3 show that in general the Doppler errors 
for the M dwarfs are larger than for the FGK stars, mostly due 
to their relative faintness. The dashed histograms in Figures 2 
are for the subset of stars with an announced planet (see §2.4). 
Once a candidate planetary signature is detected in the data, 
we increase the priority of that star in our observing schedule, 
resulting in a greater number of observations for those stars 
with announced planets. 

We include the Doppler errors and jitter by adding them in 
quadrature to find a total estimated error for each data point 

^tot.! = ^? + i^ltter- Figure 4 shows the distribution of the 
residuals after subtracting the mean velocity for a set of 386 
"quiet" stars which after a preUminary analysis show no ex- 
cess variabiUty, long term trend, or evidence for a periodicity. 
We plot a histogram of the ratio v,/crtot,i for all 3436 veloci- 
ties for these stars, compared with a normal distribution. The 
width of the observed distribution is « 30% greater than a unit 
variance Gaussian, suggesting that the estimated variabihty is 
underpredicted by this factor. To allow for this, we have mul- 
tiplied each (Ttot., by a factor of 1 .3 in the analysis that follows. 
The dotted histogram shows a normal distribution with (7=1.3 
for comparison with the observed distribution. The Gaussian 
distribution underpredicts the tails of the distribution some- 
what, but otherwise agrees quite well. The magnitude of this 
correction has only a small effect on the results of this paper, 
because when assessing the significance of a Keplerian orbit 
fit, we calculate ratios of chi-squared (for example, the ratio 
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of chi-squareds with and without the Keplerian orbit included 
in the model) and so the role of the errors crtot./ is to set the 
relative weighting of different data points (see Cumming et 
al. 1999 for a discussion). 

2.2. Long Term Trends and Excess Variability 

The first indication of a companion is often excess velocity 
variability above the level expected from measurement errors 
and stellar jitter. To assess this, we fit a straight line to each 
data set, and compare the observed scatter about the straight 
Une to the predicted value. The data for each star are a set 
of measured velocities {v,}, observation times {f,}, and esti- 
mated errors {crtot,;}- The estimated error crtot,; is the quadra- 
ture sum of the Doppler error and stellar jitter, as described 
in §2.1. We fit either a constant (f, = a) or a straight line 
ifi = a + bti) to the data. To test whether including a slope 
significantly improves the chi-squared of the fit, 
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we use an F-test (Bevington & Robinson 1992). The appro- 
priate F-statistic is 
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(Bevington & Robinson 1992), where Xconstant the for 
the best-fitting constant, and x^iope is the x^ for the best- 
fitting straight line. We determine the probability that the 
observed value of ^(iv-i),! is drawn from the corresponding 
F-distribution. If this probability is smaller than 0.1%, we 
conclude that the slope is significant. We make the choice of 
0.1% so that we expect no false detections in our sample of 
585 stars. We find that 95 stars (16% of the total) have a sig- 
nificant slope. This fraction is consistent with the 20 out of 
76 stars in the Lick sample, or 26%, that showed a long-term 
trend at the 0.1% significance level (Cumming et al. 1999). 

Having decided whether a constant or a straight line best de- 
scribes the long term behavior, we then test whether the resid- 
uals to the fit are consistent with the expected variability. We 
calculate the probability that Xconstant or Xdope drawn from a 
chi-squared distribution with A^- 1 or N-2 degrees of freedom 
respectively (Hoel, Port, & Stone 1971). We again choose a 
0.1% threshold, so if this probability is smaller than 0.1% we 
infer that there is excess variabiUty in the data. We find that 
of the 585 stars, 131 show significant variability at the 0.1% 
level, or 23% of the total. Of these 131 stars, 34 (26%) also 
show a significant slope (i.e. 6% of the 585 stars show both 
significant variability and a significant slope). This is simi- 
lar to the Lick survey results, where we found 17 out of 76 
stars, or 22%, with excess variability (Cumming et al. 1999; 
see also Nidever et al. 2002 who found that 107 out of 889 
stars showed velocity variations of more than 100 m s~' over 
4 years). 

2.3. Keplerian Fitting 

To search for the signature of an orbiting planet, we fit Kep- 
lerian orbits to the radial velocities, and assess the significance 
of the fit (Cumming 2004, hereafter C04; Marcy et al. 2005b; 
O'Toole et al. 2007). The non-linear least-squares fit of a Ke- 
plerian orbit requires a good initial guess for the best-fitting 
parameters since there are many local minima in the space. 
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Fig. 4. — The distribution of measured velocity i', normalized by the es- 
timated variability trtot., , the quadrature sum of Doppler measurement errors 
and estimated stellar jitter. The velocities used here are for a subset of quiet 
stars, after subtracting the mean for each data set. The dashed histogram 
shows a unit Gaussian. We compare the observed distribution (solid curve) 
with a Gaussian with errors increased by 30% (dotted curve). 

Our approach is to first limit the fit to circular orbits, and use 
the best- fitting circular orbit parameters as a starting point for 
a full Keplerian fit. 

It is important to note here that we search only for a single 
planet. We do not consider multiple planet systems in this pa- 
per, therefore our results for the mass and orbital period distri- 
butions apply only to the planet with the highest Doppler am- 
pUtude in a given system. To be consistent in this approach, 
we do not include any long term trends detected in §2.2 in the 
planet fits, and compare only to a constant velocity when as- 
sessing the significance of a given Keplerian fit. A star with a 
significant long-term trend will therefore be flagged as having 
a significant Keplerian fit with orbital period much longer than 
the duration of the observations. The multiplicity of planets 
as a function of their orbital periods is an important question 
that we leave to future work. 

To find the best-fitting circular orbit, we calculate the 
Lomb-Scargle (LS) periodogram (Lomb 1976; Scargle 1982) 
for each data set. This involves fitting a sinusoid plus con- 
stant*" to the data for a range of trial orbital periods P. For 
each period, the goodness of fit is indicated by the amount 
by which including a sinusoid in the fit lowers x^ compared 
to a model in which the velocity is constant in time. This is 
measured by the periodogram power 
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where Xconstant = E,(vr 



(v))^/ CTfoj is the x^ of a fit of a con- 



stant to the data, Xdrc is the x^ of the circular orbit fit, (v) is 
the mean velocity, oj = 2tt / P is the trial orbital frequency, and 
Wo the best-fitting frequency. The number of degrees of free- 
dom in the sinusoid fit is A^- 3. A large power z indicates that 
including a sinusoid significantly decreased the x^ of the fit. 
We consider trial orbital periods between 1 day and 30 years. 

We next choose two best fitting solutions as starting points 
for a full non-linear Keplerian fit. The two sinusoids are cho- 

* Note that we extend the original LS periodogram by allowing the mean 
to "float" at each frequency, following Walker et al. (1995), Nelson & Angel 
(1998), and Cumming et al. (1999), rather than subtracting the mean of the 
data prior to the fit. 
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sen so that they are well-separated in frequency. We then use 
a Levenberg-Marquardt algorithm (Press et al. 1992) to search 
for the minimum x^, starting with a Keplerian orbit with the 
same period and amplitude as the sinusoid fits, and trying sev- 
eral different choices for the time of periastron, eccentricity, 
and longitude of pericenter. Having obtained the minimum 
from the Keplerian fit, we define a power Ze to measure the 
goodness of fit analogous to the LS periodogram power for 
circular orbits. 



/ Xconstant-XKep(^) 
4 V XKep('^o) 



(5) 



(C04), where x^ep of a fit of a Keplerian orbit to the 

data (with degrees of freedom). 

The significance of the fit depends on how often a power 
as large as the observed power zq would arise purely due to 
noise alone (C04; Marcy et al. 2005b). For a single frequency 
search, the distribution of powers due to noise alone can be 
written down analytically for Gaussian noise, it is 

-(i/+2)/2 



Prob(z>zo)= 1 + 



(1^+2) 4zo 



1 + 



4zo 



(6) 



(C04), where u =N-5. However, the total false alarm prob- 
ability depends on how many independent frequencies are 
searched. For a search of many frequencies, each indepen- 
dent frequency must be counted as an individual trial. The 
false alarm probability (FAP) is 



F = 1 - [1 -Prob(z > zo)P « Nf Prob(z > zo), 



(7) 



where N/ is the number of independent frequencies, and in 
the second step we assume F is small. For small F, the FAP 
is simply the single frequency FAP multiplied by the number 
of frequencies. 

An estimate of the number of independent frequencies 
is Nf « TAf, where A/ = /2 - /i is the frequency range 
searched and T is the duration of the data set (C04). For 
evenly-sampled data, the number of independent frequen- 
cies is N/2, ranging from l/T to the Nyquist frequency 
ff^y = N/2T. For unevenly- sampled data. Home & BaUunas 
(1986) found that Nf ^ N for a search up to the Nyquist fre- 
quency (see also Press et al. 1992). This agrees with our sim- 
ple estimate since Nf ~ f2T « N/2. However, uneven sam- 
phng allows frequencies much higher than the Nyquist fre- 
quency to be searched (see discussion in Scargle 1982). In 
general, Nf ^ N, hy a factor of « fi/fwy For example, a 
set of 30 observations over 7 years has fyy « 1/(6 months). 
A search for periods as short as 2 days therefore has Nf w 
Ni6 months/2 days) f» 90A' 2700. Therefore to detect a sig- 
nal with a FAP of 10"^, the periodogram power for that signal 
must be large enough, or the Keplerian fit good enough, that 
the single-trial false alarm probability is ^ 10"^. 

The estimate for Nf and equations (6) and (7) allow an an- 
alytic calculation of the false alarm probability (see Fig. 2 of 
C04). To check this analytic estimate, we determine the FAP 
and Nf numerically using Monte Carlo simulations. The dis- 
advantage of calculating the FAP in this way is that it is com- 
putationally intensive for Keplerian fits. This is the reason 
why we consider only the two best-fitting sinusoid models as 
starting points for the Keplerian fit. We find that the analytic 
estimate of the FAP agrees well with the FAP determined by 
the Monte Carlo simulations. Our method is to generate a 
large number Aftnais of data sets consisting of noise only, using 
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Fig. 5. — The results of the search for significant Keplerian fits. The 
solid triangles (76 points) and open circles (48 points) show periodicities with 
FAP< Ur^ The open circles indicate stars that have an announced planet. 
The vertical dashed line shows P = 8 years, corresponding to the duration 
of the survey. Dotted lines show the velocity ampUtude corresponding to 
Msin; = 0.1, 1, and 10 Mj for a solar mass star. 



the same observation times as the data, and calculate the max- 
imum power for each of them. The fraction of trials for which 
the maximum power exceeds the observed value then gives 
the false alarm probability. In addition, by fitting equation (7) 
to the numerical results, we can determine Nf. This allows a 
determination of the FAP even when it is much smaller than 
1 /Mriais (C04). We generate the noise in two ways, which give 
similar results for the FAP (see Cumming et al. 1999): (i) by 
selecting from a Gaussian distribution with standard deviation 
given by the observed error atot,i for each observation time, or 
(ii) by selecting with replacement from the observed veloci- 
ties (after first subtracting the mean). The second approach (a 
"bootstrapping" method, see Press et al. 1992) has the advan- 
tage that it avoids the assumption of Gaussian noise; instead 
the actual velocity values are used as an estimate of the noise 
distribution. It is similar to the "velocity scrambling" method 
of Marcy et al. (2005b) for determining the false alarm proba- 
bility for Keplerian fits, with the main difference being that we 
select with replacement from the observed velocities rather 
than randomizing the order of the observations. 

2.4. Significant Detections 

Figure 5 shows the results of the search for significant Ke- 
plerian fits. We plot the best-fit amplitude and period for stars 
with FAP< 0.1%, divided into two categories. The open cir- 
cles are for stars with an announced planet (i.e. a published 
orbital solution) as of 2005 May; the solid triangles are stars 
with a significant Keplerian fit, but not confirmed as a planet^. 
We will refer to these significant detections that are not an- 
nounced as planets as "candidate" planets. There are 48 an- 
nounced planets detected in our search (8.2% of the total num- 
ber of stars), and 76 candidates (13% of the total). AH but five 

' We choose 2005 May as the cutoff for publication of an orbital solution 
so that the announcement of the planet is based only on the data considered 
here, which is taken before 2004 August. In fact, 7 of our 78 candidates have 
since been announced based on additional data collected since 2004 August. 
They are four of the five planets announced by Butler et al. (2006a), the planet 
orbiting the M dwarf GJ 849 (Butler et al. 2006b), and the two long period 
companions with incomplete orbits described by Wright et al. (2007). 
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of the announced planets in our sample of stars were detected 
by our algorithm (the 5 planets that were not detected orbit 
stars that were added to the survey only recently to confirm 
detections by other groups, see discussion below). 

Of the 76 candidates, 27 have large velocity amplitudes 
K > 200 m s~' corresponding to companion masses > 20 Mj. 
The remaining 49 candidates, which have Mpsini < 15 Mj, 
fall into two groups: they are either at long orbital periods 
(P > 2000 days), or low amplitudes (K < 10-20 m s"') com- 
pared with the armounced planets. That these are the detec- 
tions that have not yet been announced makes sense since (i) 
it is difficult to constrain orbital parameters for a partial orbit, 
and so we generally wait for completion of at least one full or- 
bit before announcing a planet, and (ii) for low amplitude sig- 
nals, it becomes difficult to disentangle possible false signals 
from stellar photospheric jitter or from systematic variations 
in the measured velocities. 

In the first case (long orbital periods), it is simple to under- 
stand the detection limit, which is set by the time baseline of 
the observations. The vertical dashed line in Figure 5 shows 
an orbital period of 8 years (the duration of the longest data set 
considered here), and nicely divides the announced and candi- 
date planets. In the second case (low amphtudes), an interest- 
ing question to consider is how the threshold for announcing a 
planet relates to the statistical threshold for detecting a Keple- 
rian orbit. The statistical detection threshold depends on both 
the number of observations and signal amplitude. For circular 
orbits, an analytic expression for the signal to noise required 
to detect a signal with 50% probability is given by 



100 



K 

V2s 



-I 1/2 



(8) 



(Gumming et al. 2003; C04), where F is the false alarm 
probability associated with the detection threshold, Nf is the 
number of independent frequencies, s is the noise level, and 
ly = N-3. The number of independent frequencies Nj- is set 
by the number of observations A' and the duration of the ob- 
servations T, as we discussed in §2.3. Figure 6 compares the 
signal to noise ratio for each detection with this analytic re- 
sult. For this comparison, we define the noise s to be the rms 
amplitude of the residuals to the best fit orbit, and the signal to 
noise ratio as K/s. We show only those detections with best 
fitting period P < 1000 days and mass M^sini < 15M/, for 
which signal to noise is expected to be the main limiting fac- 
tor. The dotted curves show the analytic result for a detection 
probabiHty of 50% and 99%. 

The candidate detections mostly fall near the detection 
curves in Figure 6, whereas the announced planets lie above 
the curves. The five crosses represent planets that were de- 
tected by other groups. Their host stars, HD 8574, HD 74156, 
HD 82943, HD 130322, and HD 169830, were added to the 
Keck survey to confirm these detections, but do not yet have 
enough observations for a detection. They all lie below the 
dotted curves. 

Inspection of Figure 6 shows that the detection threshold is 

determined by statistics when the signal to noise ratio is larger 
than 2-3. For lower amplitude signals, the statistical signifi- 
cance is no longer enough, since as we mentioned there is the 
danger that the observed variation is in fact from stellar jitter 
or systematic effects. Figure 6 shows that the effective detec- 
tion threshold is then at a signal to noise of « 2. To detect a 
planet with a lower amplitude than this requires significantly 
more work to rule out false signals. 




100 

Number of Observations 

Fig. 6. — Parameter space of signal to noise ratio K/s against number of 
observations for the significant detections. The "noise" is the standard de- 
viation of the residuals to the best-fit Keplerian orbit. We plot only those 
points with periods < 1000 days, and planet mass < 15 Jupiter masses, for 
which signal to noise is expected to be the limiting factor The open circles 
indicate stars that have an announced planet. The solid triangles show candi- 
date planets which have FAP< 10"'. The crosses show the five planets that 
were announced by other groups, but not flagged as significant in our search, 
which includes only a small number of observations for these stars. The dot- 
ted curves show analytic results for the signal to noise needed for a detection 
probabihty of 50% (lower curve) and 99% (upper curve) (see eq. [8]; we 
assume iV//f = 10*). 

Comparison with the published orbits shows that our auto- 
mated technique reproduces the fitted orbital parameters well, 
except for 2 of the 48 announced planets. For HD 50499, 
we find an orbital period of 2 x 10"* days rather than 3000 
days. We include a slope in the fit for this star, which repro- 
duces the published orbital parameters. We find a period of 
15 rather than 111 days for the highly eccentric planet around 
HD 80606 (which has e = 0.93; Naef et al. 2001). These 
examples illustrate the weakness of the Lomb-Scargle peri- 
odogram at providing a good initial guess for the Keplerian 
fit, in particular for eccentric orbits, and emphasises the im- 
portance of trying many different starting periods. There are 
also strong aliasing or spectral leakage effects at 1 day, and 
so when fitting Keplerian orbits, we force the orbital period to 
be longer than 1 .2 days rather than the 1 day Umit of our pe- 
riodogram search. We might also expect the search to fail for 
multiple planet systems; however, in all cases of announced 
multiple planet systems, we find a significiant single planet 
fit, usually for the planet with the largest velocity amplitude. 
This is the case even when the periods are close to or are har- 
monically related. For example, HD 128311 has two planets 
in a 2:1 resonance (Vogt et al. 2005). We detect the longer 
period planet in our search. 

2.5. Upper Limits 

We next calculate upper limits on the radial velocity am- 
plitude of planets for those stars without a significant detec- 
tion. To reduce the computational time and because we are not 
considering the eccentricity distribution in detail in this pa- 
per, we place upper limits on the amplitude of circular orbits 
only. Endl et al. (2002) and C04 showed that the detectabil- 
ity of a planet in an eccentric orbit is only slightly affected 
by eccentricity for e < 0.5, but can be substantially affected 
for larger eccentricities if the phase coverage is inadequate 
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Fig. 7. — Upper panel: histogram of the 99% upper hmit on velocity 
amplitude K of circular orbits for 461 stars without significant detections. 
The upper limit is averaged over orbital periods smaller than the duration of 
the observations (P < T). Lower panel: the fraction of stars whose upper 
limit is less than a given value of K. In each panel, the solid histogram is 
for the entire sample of stars. The dotted histogram is for the 103 stars with 
M* <0.5Mq. 



(e.g. see Fig. 6 of C04). About 14% of the known planet pop- 
ulation has e > 0.5. In our sample, 6 stars out of 48 announced 
planets have e > 0.5 (13%), and 3 have e > 0.6 (6%). How- 
ever, the true fraction with eccentricities greater than 0.5 may 
be larger because of the selection effects acting against highly 
eccentric orbits. Of the 76 candidate periodicities, 30 have 
e > 0.5, and 1 1 have e > 0.6. The larger fraction of eccentric 
orbits for these candidates than for the announced planets is 
Ukely due to the long orbital periods of many of the candidates 
for which the orbital eccentricity is not weU-constrained. We 
leave a discussion of the eccentricity distribution to a future 
paper, and here assume circular orbits. 

For all stars with FAP> 10"^, we calculate upper limits as 
described in Gumming et al. (1999), utilizing the LS peri- 
odogram for sinusoid fits. At a given orbital period, we gen- 
erate fake data sets of a sinusoid plus Gaussian noise. We 
assume that the amplitude of the noise is equal to the rms of 
the residuals to the best fitting sinusoid for the actual data. We 
then find the sinusoid amplitude that gives a LS periodogram 
power larger than the observed value in 99% of trials. We 
calculate the upper limit on as a function of orbital pe- 
riod. However, the upper limit is insensitive to period for 
P < T because the uneven sampling gives good phase cov- 
erage for most periods (Scargle 1982; Gumming et al. 1999). 
For P>T, the 99% upper limit scales close to oc (for 
periods T <P< 1007rr/8 « 300 years; see G04). 

Figure 7 is a histogram of the mean upper limit on K for 
orbital periods shorter than the duration of the observations 




100 1000 
Period (d) 

Fig. 8. — Significant periodicities (FAP< 10~^; open circles and solid 
triangles) in the mass-period plane. For a given period, the solid curves show 
the mass that can be ruled out at the 99% level from 5%, 20%, 50%, 80%, and 
95% of stars. Note that whereas the search for planets involves full Keplerian 
fits to the data, the upper limits are for circular orbits only (applicable to 
Keplerian orbits with e < 0.5, see discussion in §2.5). The dotted lines show 
velocity amplitudes K = 3, 10, and 30 m s~' for a 1 Mq star. The vertical 
dashed line shows P=8 years, corresponding to the duration of the survey. J 
and S label the locations of Jupiter and Saturn in this plot. 

(P < T)^. Most Stars have upper limits to K of between 10 
and 15 ms"'. Taking a typical value for the quadrature sum of 
jitter and Doppler errors to be cr w 3\/2 m s"' (since both jitter 
and Doppler errors are typically 3 m s~'), these upper limits 
correspond to signal to noise ratios K/a of between 2 and 3. 
This compares well with the analytic formula in equation (8) 
which gives K^2a for N = 20 and Nf = 1000. Notice that 
~ 10% of stars have upper limits > 30 m s~'. This is due to 
either large Doppler measurement errors for these stars, or a 
number of observations that is too small to give a good hmit. 
The dotted histogram shows the upper limits for the M dwarfs 
only (M* < 0.5 Mq). The upper limits are on average higher 
for these low mass stars. Radial velocity measurements are 
more difficult for M dwarfs because they are much fainter than 
solar mass stars. 

2.6. Summary of Results 

Figure 8 is a summary of the analysis in this section in the 
mass-orbital period plane. To convert between velocity ampli- 
tude K and Mp sin;, we require the stellar mass (eq. [1] gives 
Mpsini oc KM^^). For the stars with significant Keplerian 
fits, either announced or candidate planets, we use the latest 
mass estimates given in the Takeda et al. (2007) and Valenti & 
Fisher (2005) catalogs (except for 7 of the candidate stars that 
are not listed in Takeda et al. 2007 or Valenti & Fischer 2005). 
For convenience, approximate stellar masses of the remaining 
stars are determined using the B— V stellar mass relation given 
in Allen (2000)''. 

We show the detections with FAP < 10"^ in Figure 8 as 

* We average the upper limits over period here only to summarize the re- 
sults of our calculation. In the next section, where we correct for incomplete- 
ness, we do not average over period but instead use the upper limits calculated 
as a function of period. 

' This relation underestimates the stellar mass for some stars, typically by 
30% but sometimes as much as a factor of 2, for those stars which are metal 
rich. Therefore we expect the blue curves showing upper limits on Mp sin; 



8 



solid triangles and open circles, dividing them into candidates 
and announced planets respectively. As expected from the 
discussion in §2.4, the candidate periodicities (solid triangles) 
are mainly concentrated at A' < 10 m s~' or at P > 1000 days. 
The solid curves in Figure 8 summarize the upper limits as a 
function of period. For a given period, they show the mass 
which can be excluded at the 99% level from 5%, 20%, 50%, 
80%, and 95% of stars. The aliasing effects are not strong: 
the upper limits vary smoothly with period because of the un- 
even time sampling which gives good phase coverage at most 
orbital periods (e.g. Scargle 1982). However, there is a slight 
reduction in sensitivity at periods close to 1 year, 2 years, and 
1 day. The curves turn upwards and scale as for periods 
beyond « 3000 days (« 8 years), close to the duration of the 
observations. 

The results shown in Figure 8 allow us to draw a number 
of conclusions. First, there are many candidate gas giants in 
orbital periods of 5-20 years, similar to our Solar System. 
Figure 8 shows that the main limitation at the moment for de- 
tection of an analog of our Solar System is the duration of the 
survey, rather than the sensitivity. We suspect that some of 
these long period candidates will turn out to be more massive 
that the M sin / indicated in Figure 8, since only a partial orbit 
has been observed, leaving the fitted mass uncertain. There 
are several candidates with K < 10 m s"'. For these candi- 
dates, further observations are needed to rule out stellar jitter 
as the cause of the observed periodic variability. The upper 
limit curves continue smoothly to periods smaller than 3 days, 
and are not noticeably affected until they get close to 1 day. 
This implies that the abrupt drop in the number of planets at 
^oib ^ 3 days is a real feature. In the "period valley" between 
10-100 days (Jones et al. 2003; Udry et al. 2003), the de- 
tectability is good, with upper limits of « 0. 1-0. 2My for 50% 
of stars. However, for periods > 100 days, the upper limits are 
larger, O.3-O.6M7. Therefore the reported lack of objects with 
M < 0.75My at f > 100 days by Udry et al. (2003) is clearly 
dependent on understanding the selection effects in this re- 
gion. We address this in the next section by using these upper 
limits to correct the observed period and mass distributions 
for incompleteness. 



3. THE MASS-PERIOD DISTRIBUTION 

In this section, we describe a method for correcting for in- 
completeness by taking into account the non-detections, and 
discuss the resulting distribution of minimum masses and or- 
bital periods. For conciseness we will refer to the minimum 
mass Mpsini as "mass" throughout this section, although it 
should be noted that we do not include the distribution of in- 
clination angles / which is needed to determine the true mass 
from the measured minimum mass (Jorissen, Mayor, & Udry 
2001; Zucker & Mazeh 2001). This is reasonable for a large 
statistical sample. The average value of the ratio of minimum 
mass to true mass is (sin/) = tt/4 = 0.785, a small correction 
for this analysis, and we also note that power law scalings 
are not affected by the unknown sin/ factors (Tabachnik & 
Tremaine 2002). 

as a function of period averaged over all stars in Figs. 8 and 9 to be uncertain 
at the 20% level because of this approximation. However, note that this 
uncertainty does not affect the completeness corrections and mass and period 
distributions calculated in §3 because they depend on the velocity upper limit 
directly. The only stellar mass information needed in §3 is for the announced 
and candidate planets, for which we use the accurate Takeda et al. (2007) or 
Valenti & Fischer (2005) masses. 
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Fig. 9. — Detections for F, G, and K stars corrected for completeness of 
the survey. We show the announced planets (those pubhshed by 2005 May; 
black circles) and candidates (significant detections not corresponding to a 
published planet; green circles), with the area of the circle indicating the size 
of the completeness correction Ni for each point. The vertical dashed line 
indicates a period of 8 years, roughly equal to the duration of the survey. The 
dotted lines show velocity amplitudes of 3, 10 and 30 m s"' for a IMq star. 
The blue curves summarize the upper limits as in Fig. 8. 

3.1. Including the upper limits 

Several methods have been discussed in the literature for 
finding the distribution most consistent with a set of detections 
and upper limits (Avni et al. 1980; Feigelson & Nelson 1985; 
Schmitt 1985). Such data are known as censored data, and the 
analysis as survival analysis. Correcting for the upper limits 
involves counting which of them usefully constrain the distri- 
bution at a given point. Avni et al. (1980) present a method for 
doing this counting for a one-dimensional distribution. Here, 
we generalize this approach to the two-dimensional mass- 
period plane. We follow Avni et al. (1980) and present a 
heuristic derivation in this section; a more detailed derivation 
by a maximum likelihood method is given in the Appendix. 
The reader may find it useful to compare our discussion with 
§111. D of Avni et al. (1980) which describes the I -dimensional 
case. 

It is instructive to consider a hypothetical example to de- 
velop some intuition. Imagine that after looking at a given 
star there were three possible outcomes: we either (i) detect a 
planet, (ii) completely exclude the presence of a planet, or (iii) 
are not able to say anything (i.e. cannot exclude or confirm the 
presence of a planet). First, we observe A^* stars with the re- 
sult that A^pianet plancts are found and we are able to rule out 
planets from all of the remaining stars. The best estimate of 
the fraction of stars with planets is then / = A^pianet/A^*- How- 
ever, what if we are able to rule out planets only from a subset 
of stars, A^no planet < A^* " A^pianet ? In this casc, the best esti- 
mate of the planet fraction is to take the ratio of the num- 
ber of detections to the number of stars for which a detection 
was possible, / = A/pianet/(A'no planet + A'pianet)- We must take 
A^no pianet+A^pianet < A^* as the denominator bccausc for each de- 
tected planet, the actual pool of target stars is less than the to- 
tal pool, because the data for some of the stars are inadequate 
to detect that planet. As an extreme case, consider looking 
at 100 stars, and being able to say that one has a planet, one 
does not, and nothing about the remaining 98 stars (A^pianet = 1 
and A^no planet = !)■ The best estimate for the planet fraction is 



9 



then 50%. The extra 98 stars for which no useful upper limit 
could be obtained do not contribute to the estimate. This sim- 
ple example tells us how to interpret Figure 8. In a region of 
the mass-period plane in which planets can be ruled out for a 
fraction k of stars, the best estimate of the number of planets 
is « 1/fc times the number of detections. 

Our approach is to assign an effective number A^, for each 
detected planet /. If selection effects are unimportant, A^, = I, 
so that the detected planet is a good representation of the num- 
ber of planets at that mass and period. However, Ni will be 
greater than 1 if planet i has orbital parameters in part of the 
mass-period plane where completeness corrections are impor- 
tant. For instance, if the completeness for planets at a given 
mass and period is 50%, then A^, for a planet ; at that mass and 
period will be 2. The idea behind this method is that we are 
samphng the mass and period distribution at the discrete set of 
points corresponding to the mass and periods of the detected 
planets. Of course, the underlying distribution is likely to be 
smooth, and so the quantity Ni/Ni, should be thought of as the 
probability that a star has a planet with mass and period close 
to those of planet i. The normalization of A^, is such that the 
total fraction of stars with planets is Y^iNi/Nj,. 

To calculate A'; for a given star with a detected planet, we 
must count how many of the stars with non-detections could 
have an undiscovered planet with the same mass and period 
as planet We therefore consider each non-detection in turn 
and ask whether the upper limit K^p calculated in §2.5 allows 
a companion with period Pi and to be present. If so, we 
must increase A^, to allow for this incompleteness. However, 
the upper limit K^p allows a hypothetical planet to be present 
with any amplitude or orbital period that satisfies K < Xup, 
not just the orbital parameters of planet ;. Therefore, rather 
than increasing A', by 1 for every upper limit that allows addi- 
tional undiscovered planets at that location, we must increase 
A', by the probability that the hypothetical planet would have 
the same parameters as planet i. In effect, we "share out" the 
hypothetical planet amongst all the possible locations in mass 
period space that are allowed by the upper limit. This leads to 
the following rule for increasing Ni each iteration: 



Nl 



•n+l 



1 + 



J,f^up.j>Ki 



(9) 



where n labels the iteration and the sum is over all non- 
detections j which allow an undiscovered planet with the 
properties of planet ; to be present. Here, for each of the non- 
detections considered in the sum, ^ is the upper limit on 
the velocity amplitude (§2.5) at the orbital period of planet i. 
The second term in equation (9) is the probability that a planet 
with a velocity amplitude below the upper limit has the period 
and amplitude corresponding to planet ;. It is weighted by the 
normalization factor Z in the denominator which counts all 
the possibilities consistent with the upper limit: these are ei- 
ther (i) a planet is present with K < K^p, or (ii) no planet is 
present. Mathematically, we write this as the probability that 
the star does not have a planet with an amplitude that violates 
the upper limit (K > /Tup), 



Z(^:up) = l- 



E 



N.' 



(10) 



where the sum is over all detected planets i whose velocity 
amplitude A", exceeds K^p . 

For example, suppose we wish to calculate A^, for planet j 
with K = 20 ms~^. We start with our initial guess of Nf = 1 



for all /. We then turn to planet j, and, for that planet, con- 
sider each star with a non-detection in turn. Let us say that 
the first such star has high jitter or a small number of data 
points, so that the upper limit on the velocity amplitude of a 
companion is 30 m s~' . This means that a companion with 
an amplitude of 20 ms~\ the same as planet j, cannot be 
ruled out, and so we must add a contribution to Nj. To 
do this, we first use equation (10) to calculate Z(30 m s"'), 
where the sum is over all detected planets with AT > 30 m s"^ 
If there are 30 such planets, and 500 total stars, then using 
the current values A^'' = 1 (this is the first iteration), we find 
Z(30ms-i)= 1-30/500 = 0.94. This is the probability (given 
our current values for Ni) that a star does not have a planet 
with A" > 30 m s~^ We use this value in the first term of 
the sum in equation (9), which means that we add an amount 
(l/500)(l/0.94) = 2.1 X 10-2 to NJ. Because we know that 

there is no planet with A" > 30 m s"' around this star, the prob- 
ability that there is a planet with the properties of planet j is 
larger than 1 /500. We now continue with the sum. Let us 
say that the second star with a non-detection is well-observed 
and those observations rule out a planet with the properties of 
planet j. This star then contributes nothing to the sum in equa- 
tion (9). We continue for all stars with non-detections to com- 
plete the sum in equation (9) and arrive at the new value Nj . 
We then repeat this calculation to obtain Nj for all detected 
planets i. This entire procedure is iterated until all values of 
A^, have converged. 

An important question is how much our results depend on 
the candidate detections, since these detections await further 
observations before they can be confirmed as being due to 
an orbiting planet. Long period signals need further obser- 
vations to cover at least one orbit; low amplitude signals are 
potentially due to other factors such as stellar jitter Therefore 
it is likely that some of these candidate periodicities are not 
due to a planet and should not be included. Even if they are 
due to a planet, as more data are collected, the best-fit orbital 
parameters of these candidates may change. To answer this 
question, we have calculated the values of A^,- both with and 
without the candidate detections, by either including the can- 
didate detections or by treating them as non-detections, with 
the upper Umit K^p given by the detected velocity amplitude. 
We find that our conclusions regarding the mass-orbital period 
distribution are similar in each case (see, for example Table 1 
discussed below). 

3.2. A power law fit using maximum likelihood 

As an alternative to the non-parametric description of the 
period and mass distribution that we described in the last sec- 
tion, we also fit the distribution with the simplest parametric 
model, a power law in mass and period. One way to do this 
would be to take the corrected data given by the A', values we 
have described, and fit a power law to the histogram or the 
cumulative distribution. Instead, we have used a maximum 
likelihood technique to fit a power law in mass and orbital pe- 
riod simultaneously to the original data (see also Tabachnik 
& Tremaine 2002). In the Appendix we start with the same 
likelihood that is used to derive the non-parametric technique 
described in §3.1, but instead consider a parametric model in 
which the probablility of a star having a planet at mass M and 
period P is dN = CM"P^ dlnM dlnP, where C is a normal- 
ization constant. The resulting expression for the likelihood 
is given in equation (A 19), and we evaluate this numerically 
and maximize it with respect to the two parameters a and (3. 
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TABLE 1 

Cumulative percentage of stars with a planet^ 





P< 11.5d 


100 d 


lyr 


2.8 yr 


5.2 yr 


11.2 yr 














/Anon A\ 




a < 0.1 AU* 


0.42 AU 


1 AU 


2AU 


3AU 


5 AU 


M>2Mj 





0.43 (0.3) 


1.1 (0.5) 


1.9(0.6) 


2.6 (0.7) 


4.2 (0.9) 







0.42 (0.3) 


1.1 (0.5) 


1.9 (0.6) 


2.5 (0.7) 


4.0 (0.9) 







0.45 (0.3) 


1.1 (0.5) 


2.1 (0.7) 


2.8 (0.8) 


3.0 (0.8) 







0.42 (0.3) 


1.1 (0.5) 


1.9(0.7) 


2.5 (0.8) 


2.8 (0.8) 


IMj 


0.43 (0.3) 


0.85 (0.4) 


1.9 (0.6) 


3.9 (0.9) 


4.6 (1.0) 


8.9 (1.4) 




0.42 (0.3) 


0.85 (0.4) 


1.9 (0.6) 


3.8 (0.9) 


4.4 (1.0) 


8.3 (1.4) 




0.46 (0.3) 


0.9 (0.4) 


2.1 (0.7) 


4.3 (1.0) 


5.0 (1.0) 


6.3 (1.2) 




0.42 (0.3) 


0.85 (0.4) 


1.9 (0.7) 


3.8 (1.0) 


4.4 (1.0) 


5.5 (1.2) 


0.5 Mj 


1.1 (0.5) 


1.9 (0.6) 


3.3 (0.8) 


6.3 (1.2) 


7.5 (1.3) 






1.1 (0.5) 


1.9 (0.6) 


3.2 (0.8) 


5.9 (1.2) 


7.0 (1.3) 






1.2 (0.5) 


2.1 (0.7) 


3.5 (0.9) 


6.2 (1.2) 


7.2 (1.2) 






1.1 (0.5) 


1.9 (0.7) 


3.2 (0.9) 


5.5 (1.2) 


6.4 (1.2) 




0.3 Mj 


1.5 (0.6) 


2.4 (0.7) 


3.7 (0.9) 


6.7 (1.2) 


8.5 (1.3) 






1.5 (0.6) 


2.3 (0.7) 


3.6 (0.9) 


6.4(1.2) 


7.6(1.3) 






1.6(0.6) 


2.6 (0.7) 


4.0 (0.9) 


6.7(1.2) 


7.7(1.3) 






1.5 (0.6) 


2.3 (0.7) 


3.6 (0.9) 


5.9 (1.2) 


6.8 (1.3) 




0.1 Mj 


2.0 (0.7) 


3.9 (0.9) 












1.9 (0.7) 


3.4 (0.9) 












2.2 (0.7) 


4.3 (1.0) 












1.9(0.7) 


3.4(1.0) 











" The percentage of F,G, and K stars with a planet at a shorter orbital period than the one given, and with Mp sin ; greater than the given mass, but less than 
15 Mj. In each case, we give four values: (1) the percentage derived by including both announced and unannounced detections, (2) the same as (1) but without 
completeness corrections, (3) the percentage derived from including the announced planets as detections but treating the unannounced detections as upper limits, 
(4) same as (3) but without completeness corrections. The total number of stars is 475. The Poisson error is given in parentheses after each entry. See §2.1 for a 
description of the stellar sample selection and distribution of spectral types.'' The corresponding semimajor axis for a solar mass star. 



3.3. Results for F, G, K dwarfs 

We first present results for the 475 stars with M^, > 0.5 Mq 
which have F, G, and K spectral types (see §2.1 for a discus- 
sion of the sample selection and the range of spectral types). 
We do not attempt a detailed study of the stellar mass de- 
pendence of planet occurrence rate or orbital parameters in 
this paper. Fischer & Valenti (2005) show that the apparent 
rate of occurrence of planets increases by a factor of 2 for 
stellar masses between 0.75 and 1.5M0. Investigating the 
stellar mass dependence of planet properties requires untan- 
gling it from the effect of stellar metallicity, beyond the scope 
of this paper (e.g., see the recent discussion in Johnson et 
al. 2007). However, several recent papers have pointed out 
that the planet occurrence rate around M dwarfs appears to 
be several times lower than around F, G and K stars (Butler 
et al. 2004b, 2006b; Endl et al. 2006; Johnson et al. 2007). 
To avoid biasing our results for the occurrence rate of planets, 
and to investigate the occurrence rate around M dwarfs fur- 
ther, we treat stars with masses < 0.5 Mq separately in §3.4. 
In addition, three of the 48 stars with announced planets were 
added to the survey to confirm detections by other groups. To 
ensure a fair sample, we remove these stars from our analysis. 

Figure 9 shows the results of the calculation described in 
§3.1. We plot the mass and period of announced planets 
(black circles) and candidate detections (green circles), as 
well as the upper limit contours (blue curves) as in Figure 
8. The value of A^, for each detection is indicated by the area 
of the circle. The smallest circles, well above the detection 
threshold, correspond to A', = 1 . At lower masses, the values 
of Ni are roughly consistent with the simple argument given in 
§3.1, that if the detection lies at a mass and period excluded 
by a fraction k of the upper limits, then roughly A^, l/k. 



For K ^ 10 ms"', the completeness corrections are roughly a 
factor of two, and are close to unity for > 20 m s"' . 

Figure 10 shows the result of our power-law fit (§3.2). We 
fit the data in the mass range 0.3-10 Mj and 2-2000 days, 
which includes 32 announced planets, and 4 candidate de- 
tections. The constraints on a. and /? are shown in Figure 
10. The best fit values which maximize the likelihood are 
a = -0.31 ± 0.2 and /3 = 0.26 ±0.1. The error in a is larger 
than the error in (3, presumably because the dynamic range 
in orbital periods is greater than in the planet masses. The 
best fitting power law gives a fraction of stars with a planet 
of 10.5% in this mass and period range, which compares with 
the value of 8.5% derived from our completeness corrections 
(see Table 2). The values of a and (3 are shghtly correlated, 
in the same sense as Tabachnik & Tremaine (2002) observed, 
but to a smaller degree. 

3.3.1. Fraction of stars with a planet 

Summing Ni in a particular region of the mass-orbital pe- 
riod plane and dividing by A^* gives the fraction of stars with 
planets in that region. The percentage of stars with a planet 
more massive than a given mass and closer to the star than 
a given orbital period is listed in Table 1 . In parentheses we 
give the Poisson error based on the number of detections. For 
each table entry, we give four values. The first two are the 
percentages derived by including both announced planets and 
unannounced candidates, with and without completeness cor- 
rections included. The third and fourth values are the per- 
centages derived by including the announced planets only (in 
which case the velocity amplitude of each candidate is treated 
as an upper limit on velocity amplitude rather than a detec- 
tion), with and without completeness corrections included. 
The two values are similar over most of Table 1. The largest 
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Fig. 10. — Contours of constant likelihood for a power law fit dN oc 
M"pl^ dlnM dlnP. The contours correspond to the 68% and 95% confi- 
dence intervals (AL/2 = 1 and 4). The best fitting values are a = —0.31 and 
3 = 0.26. We include announced planets and candidates in the period range 
2-2000 days and mass range 0.3-10 Mj in the fit. 
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Fig. 12. — Same as Figure 11, but now for short orbital periods P < 
30 days, and for masses Mpsini > 0.1 Mj. All detections correspond to 
aimounced planets in this period and mass range. 
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Fig. 1 1 . — Distribution of orbital periods for planets with periods< 2000 
days and mass MpSini > 03Mj. In the lower panel, the dotted histogram 
shows the number of detections in each bin, including announced and can- 
didate detections; the soUd histogram shows this number corrected for com- 
pleteness. Error bars indicate Vn for each bin. The upper panel shows the 
cumulative percentage of stars with a planet with orbital period smaller than 
a given value. 



differences of w 30% are at long periods > 2000 days, where 
there are very few announced planets, and many candidate 
periodicities. 

3.3.2. Distribution of orbital periods 

Figure 1 1 shows the orbital period distribution for planets 
with Mpsin; > 0.3 Mj for orbital periods up to 2000 days, 
beyond which the detectability declines as the orbital period 
approaches the duration of the survey. In the lower panel, 
the dotted histogram is the distribution of detections, includ- 
ing announced and candidate detections; the solid histogram 
is the distribution of detections after correcting for complete- 
ness, i.e. summing Ni in each bin. For each bin, we indicate 



Model 


P < 5.2 yr 

a < 3 AU 


11 yr 
5 AU 


32 yr 
10 AU 


89 yr 
20 AU 


flat 

dN/dlog^gP = 6.5% 
power law 
d]nN/d]iiP = 0.26 


8.5 (1.3) 
8.5 (1.3) 


11 (1.7) 
11(1.7) 


14(2.1) 
14 (2.3) 


17 (2.6) 
19 (3.0) 



^ The cumulative percentage of stars with a planet A'(< P), based on either 
a flat extrapolation or power law extrapolation beyond P = 2000 days, for 
planet masses 0.3 < Mpsini < 15 Mj. The semrmajor axes are for a solar 
mass star. The extrapolated Poisson error is indicated in parentheses. 



the expected Poisson ^/N errors based on the number of detec- 
tions but rescaled by the ratio of J2 that bin to the number 
of detections in that bin. The upper panel shows the cumu- 
lative histogram, showing the fraction of stars with a planet 
within a given orbital period. For clarity, we do not show 
the Poisson errors on the cumulative histogram, but they can 
be calculated based on the total number of stars. For exam- 
ple, approximately 2.4% of stars have a planet more massive 
than 0.3 Mj with an orbital period of < 100 days. This rep- 
resents 11 .3 ± 3.4 stars out of the total of 472, or a fraction 
2.4 ±0.7%. 

At the shortest periods, the period distribution shows the 
well known pile up of planets at orbital periods close to 3 
days. This is illustrated in more detail in Figure 12 which 
shows the period distribution for those planets with P < 30 
days, and masses Mpsini > 0.1 Mj (the increased detectabil- 
ity of close in planets means that we can go to lower masses 
than in Fig. 11). All of these planets are announced; there are 
no candidate planets in this range of mass and period. But- 
ler et al. (1996) mention that there are no significant selec- 
tion effects that would lead to this pile up: we can see that 
very clearly in Figure 9, in which the upper limit curves con- 
tinue smoothly to periods as short as 1 day with no change 
in detectability. The statistical significance of the pile up in 
our data depends on the model and range of orbital periods 
against which it is compared. A Kolmogorov-Smimov (KS) 
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test gives a 0.4% probability that the observed distribution is 
drawn from a uniform distribution in \ogP in the decade 1 to 
10 days. If we extend the uniform distribution out to 100 days, 
the KS probability is 20%. 

As we described earlier, a power law fit to the period distri- 
bution gives dN/d\nP oc which rises to longer periods. 
However, an alternative description of the distribution is a 
rapid increase in the planet fraction at orbital periods of w 300 
days. Figure 1 1 shows that there is a change of slope in the 
cumulative distribution which suggests that the planet fraction 
increases beyond orbital periods of « 300 days. The change 
in slope does not depend on whether the candidate planets 
are included. If we assume that the orbital period distribution 
above and below 300 days is flat, we find that the fraction of 
stars with a planet per decade is dN/dlog^QP = 1.3 ± 0.4% 
at short periods, and dN/dlog^(^P = 6.5 ± 1.4% at long pe- 
riods (the latter becomes dN / dlog^^P = 5.1 ± 1.2% if only 
announced planets are included). Therefore, the incidence of 
planets increases by a factor of 5 for periods longwards of 
w 300 days. The low planet fraction at intermediate orbital 
periods has been noted previously. Jones et al. (2003) and 
Udry et al. (2003) both pointed out that there is a deficit of 
gas giants at intermediate periods, P ~ 10-100 days. 

We have focused on the period distribution for P < 2000 
days, but Figure 9 shows that there are many candidate plan- 
ets with orbital periods P > 2000 days. In our analysis, we 
have assumed that the minimum mass and orbital period of 
detected companions are well-determined. However, for or- 
bital periods longer that the time-baseline of the data, this is 
not the case: there exists a family of best-fitting solutions with 
a range of allowed orbital periods, masses, and eccentricities 
(see e.g. Ford 2005; Wright et al. 2007). To constrain the dis- 
tribution at long orbital periods requires taking into account 
the distributions of orbital parameters allowed by the data for 
each star. For now, we extrapolate the period distribution de- 
termined for P < 2000 days to predict the occurrence rate of 
long period orbits assuming that either the flat distribution or 
the power law oc P^ holds for longer orbital periods. The re- 
sults are given in Table 2. For example, if the distribution is 
flat in logP beyond 2000 days, we expect that 17% of solar 
type stars harbor a gas giant (Saturn mass and up) within 20 
AU. In the power law case, the fractions are larger, but not 
substantially larger because of the small value of /3 = 0.26. 
These extrapolations are consistent with the number of candi- 
dates we find at long orbital periods. If we sum the confirmed 
planets and candidates at long periods, taking the complete- 
ness corrections into account, and assuming that the fitted or- 
bital periods are the correct ones, we find that 18% of stars 
have a planet or candidate within 10 AU. This is the same 
fraction as our extrapolations suggest for orbital separations 
less than 20 AU. However, the uncertainties in orbital param- 
eters need to be taken into account before we can use the long 
period candidates to learn about the period distribution at long 
orbital periods. 

3.3.3. The mass Junction of planets 

The Mpsini distribution of planets with Mpsini > 0.3 Mj 
and P < 2000 days is shown in Figure 13. As we have 
discussed, a power law fit to the mass function gives 
dN/dlnM oc M", with a = -0.31 ± 0.2, so fliat flie distribu- 
tion in In M is approximately flat, but slowly rising to lower 
masses. Our value for a agrees with the dN/dM oc M~'-' 
found by Butler et al. (2006a) by fitting the mass function 
of planets detected in the combined Keck, Lick, and AAT sur- 




FlG. 13. — The distribution of Mp sin; for planets with P < 2000 days and 
Mp sin I > 0.3 Mj. In the lower panel, the dotted histogram shows the nvimber 
of detections in each bin; the solid histogram shows the number corrected for 
completeness. The cumulative distribution is shown in the upper panel. 



veys (see also Jorissen, Mayor, & Udry 2001). The cumula- 
tive distribution in the upper panel of Figure 1 3 (which shows 
the fraction of stars with a planet more massive than a given 
mass) shows a corresponding close to linear increase to lower 
masses. The absence of a turnover in the cumulative distribu- 
tion at low masses shows that our results are consistent with 
the mass function remaining approximately flat in InM to the 
lowest masses with good detectability. 

An important question is whether the mass function is de- 
pendent on orbital period. In Figures 14 and 15, we show the 
mass distributions in three different period ranges: periods 
less than 10 days, between 10 days and 1 year, and greater 
than 1 year. In the context of theoretical models for planet 
formation and migration, these ranges correspond to planets 
that have undergone different amounts of migration (e.g. Ida 
& Lin 2004). For orbital periods less than a year, we give 
the distribution down to 0.1 Mj, but for longer orbital periods 
where detectability is not as good, we restrict the mass range 
to > 0.3My. We detect no close in planets with M > 2Mj. 
This lack of close, massive planets has been noted before, and 
is significant: Figure 9 shows that the survey is complete in 
that region of the mass-period plane. At the longest orbital pe- 
riods, there are almost as many detections in the mass range 
0.3 < Mpsin/ < 1 (half a decade) as there are in the range 
1 < Mp sin; < 10 (a fuU decade), suggesting a steeper fall off 
with mass than the overall mass distribution. However, this 
depends on future confirmation of the candidates with addi- 
tional observations, and depends on the completeness correc- 
tions, which are significant in the lowest mass bin. 

Ida & Lin (2004) predict a mass-orbital period "desert" at 
low masses and intermediate orbital periods. Our data show 
no evidence for a drop in the planet occurrence rate at low 
masses, as can be seen in the central panel of Figure 14, al- 
though as Figure 9 shows, we are not able to address the distri- 
bution for masses < 0.1-0.2 Mj in this period range because 
of selection effects. In their study of the mass-period distri- 
bution, Udry et al. (2003) found evidence for a deficiency of 
planets with Mp sin / < 0.75 Mj at orbital periods ^100 days. 
They simulated the detectability of planets in that region, and 
concluded that detections would have been made if the mass 
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Fig. 14. — Histogram of Mpsin/ for planets in different period ranges. 
The dotted histogram shows the number of detections in each bin; the solid 
histogram shows the number corrected for completeness. 

distribution of planets at P > 100 days was similar to that of 
the hot jupiters. Interestingly, we find Saturn mass candidates 
at periods longer than 1000 days, but not in the range 100- 
1000 days. However, the small number of detections in that 
region of the mass-period plane and the fact that the com- 
pleteness corrections are significant there mean that we can- 
not come to any definite conclusions. 

Fitting a uniform distribution in logM to the distributions 
in Figure 14, we find a fraction per decade of dN / dlogy^M = 
1.8 ±0.6% for P < 10 d, 1.9 ±0.5% for 10 d < P < 1 yr, 
and 3.9 ± 0.9% for 1 yr < P < 2000 days. It is interesting to 
extrapolate this distribution to lower masses. For example, at 
short periods P < 10 days, we expect 3.1% of stars to have a 
planet more massive than lOM®, and 4.7% to have an Earth 
mass planet or larger. For periods less than 1 year, extrapo- 
lation gives 7.4% of stars with Mp > 10 Mq, and 11% with 
Mp > 1 Mq . 

3.3.4. Comparison with previous work 

Our results for the fraction of stars with planets and the 
mass and orbital period distributions are generally consistent 
with previous determinations. Marcy et al. (2005a) estimate 
that 12% of stars have a gas giant within 20 AU, based on a flat 
extrapolation of the orbital period distribution of the detected 
planets in the combined Lick, Keck, and Anglo-Australian 
surveys. If we take announced planets only without complete- 
ness corrections, we find that the fraction of stars with a planet 
per decade is 4.8 ± 1.0%, which extrapolates to 13 ±2.5% 
within 20 AU in good agreement with Marcy et al. (2005a). 

Tabachnik & Tremaine (2002) fit the mass and period dis- 
tributions with a double power law, dN cx M^pf^dlnMdlnP, 
accounting for selection effects by estimating the detection 
thresholds for each of the Doppler surveys. They obtained 



0.025 
0.02 

0.015 
0.01 

0.005 


(/) 0.04 

D 

^ 0.03 

o 

c 0.02 



o 



0.04 '- 
0.03 E- 

0.02 
0.01 f- 




T ' "I 

P < 10 d 



i I I I I ll| 1 — I I I I 1 — I I I I l ll| 

~ lOd < P < lyr 



0.01 



r I I I I II \ — I II I M il \ — h-Hhh 

0.05 '- 



^ lyr < P < 2000d 



_l I I 



0.1 



10 



Mp sin i (Mj) 



Fig. 15. — Cumulative distribution of Mp sin / for planets in different period 
ranges. 



a = -0. 1 1 ± 0. 1, which agrees with our value a = -0.3 1 ± 0.2, 
and /3 = 0.27 ± 0.06, which agrees with our /3 = 0.26 ±0.1. 
Our error bars are larger than those of Tabachnik & Tremaine 
(2002) because in their work they included several surveys, 
and so have a larger total number of stars in their sample (their 
results for the Keck survey alone have similar error bars to our 
results). Extrapolating, Tabachnik & Tremaine (2002) found 
that 4% of solar type stars have planets with IMj <M < lOMj 
and 2 d < P < 10 yr. Our results give 6-9% depending on how 
the candidate detections are included (Table 1). 

Lineweaver & Grether (2003) also fit a double power law 
in mass and period. Their technique was to define an area 
in which they estimate all planets have been detected around 
stars currently being surveyed, and extrapolate to longer pe- 
riods and lower masses. For the mass function, they found 
a = -0.8 ± 0.3, i.e. a significant rise in the mass function at 
low masses. We do not observe such a rise in Figure 13. For 
the period distribution, they found /? = 0.7 ± 0.3 which again 
indicates a rise in the planet occurrence rate at long periods. 
They estimated that 9% of stars have a planet with M > 0.3My 
and P < 13 yrs. Our extrapolation suggests a fraction 1 1% in 
this mass and period range (see Table 2). 

Naef et al. (2005) present an analysis of data for 330 stars 
with 18 detected planets from the ELODIE Planet Search 
program. They give the fraction of stars with planets more 
massive than 0.5M/ within three different orbital periods: 
0.7 ±0.5% for P < 5 d, 4.0 ± 1.1% for P < 1500 d, and 
7.3 ± 1.5% for P < 3900 d. For the same mass and period 
ranges, we find 0.65 ±0.4%, 6.9 ± 1.2% and 12 ± 1.6% (this 
last number is 8.6 ± 1.3% if only announced planets are in- 
cluded). 



3.4. Results for M dwarfs 
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Fig. 16. — Same as Figure 9, but .showing only the detections for the 110 
stars with M* < 0.5Mq . The blue curves show the upper limits corresponding 
to this mass range. The dotted lines show velocity amplitudes of 3, 10 and 
30 m s"' for a O.3M0 star. 

We now turn to the M dwarfs in the sample. Figure 16 
shows the results of the calculation described in §3.1 applied 
to the 1 10 stars with M < 0.5 Mq. Having detected 46 planets 
from 475 F, G, and K stars, we would expect 1 1 planets in this 
sample of M dwarfs if the planet frequency and detectability 
were the same. Instead, there are only 2 announced planets: 
GJ 876, which is in fact a triple system but our search detects 
only the most massive planet at P = 60 days, and GJ 436, a 
0.07 Mj planet in an orbit of less than 3 days. 

Several recent papers have addressed the apparent paucity 
of gas giant planets around M dwarfs. In their paper an- 
nouncing the discovery of the Neptune-mass planet orbiting 
GJ 436, Butler et al. (2004b) estimated that the planet frac- 
tion for M dwarfs is w 0.5% for masses > 1 Mj and periods 
< 3 years, roughly an order of magnitude lower than around 
F and G main sequence stars. Butler et al. (2006b) announced 
the detection of a 0.8 Mj planet in an orbit with P = 1890 
days around GJ 849. Including both GJ 876 and GJ 849, 
they estimate a planet fraction 2/1 14= 1.8±1 .2% for planet 
masses > 0.4 Mj and periods a < 2.5 AU. In the same range 
of a but with a larger mass limit Mpsini > 0.8My, Johnson 
et al. (2007) find that this fraction is 1.8 ± 1.0% for stars in 
the mass range O.1-O.7M0. Endl et al. (2006) give a limit on 
the fraction of M dwarfs with planets of < 1.27% at the Icr 
confidence level. This result is less constraining, since they 
estimate that their survey of 90 M dwarfs is 95% complete 
for Mp sin / > 3.5 My and a < 0.7 AU, and Table 1 shows that 
we find a planet fraction of « 1% for planets in this mass 
and period range around FGK stars. These observational con- 
straints agree with predictions of core accretion models for 
planet formation, which find that Jupiter mass planets should 
be rare around M dwarfs, with the mass function of planets 
shifted towards lower masses (Laughlin et al. 2004; Ida & Lin 
2005; Kennedy & Kenyon 2008) (although Kornet & Wolf 
2006 predict a greater incidence of gas giants at lower stellar 
masses if the protoplanetary disk parameters do not change 
with stellar mass). 

With so few detections in our sample we obviously cannot 
say anything about the mass-period distribution. However, we 
can address the possible role of selection effects and constrain 
the relative fractions of stars with planets around M dwarfs 



compared to FGK stars. We again take a maximum likeli- 
hood approach. We assume that the mass-period distribution 
for M dwarfs is the same power law distribution that we fit for 
the larger sample of FGK stars, dN = C M^pi^ dlnM dlnP, 
with a = -0.31 and f3 = 0.26, but with a different normal- 
ization constant C. (The mass-period distribution is likely 
different for M dwarfs than FGK stars, but this is the sim- 
plest assumption given the available data). We then calcu- 
late the likelihood for the ratio of normalization constants 
r = C{M^ < 0.5 Mq)/C{M^ > 0.5 Mq). This is shown in 
Figure 17. We use the same mass range 0.3-10 My, and pe- 
riod range 2-2000 days as in §3.1. The best fitting value is 
r = 0.095, indicating that M dwarfs are 10 times less likely to 
harbor a gas giant within 2000 days. The 95.4% (2(t) upper 
limit on the relative planet fraction is 0.5 1 . Using the normal- 
ization from the power law model (which for the best-fitting 
model has 10.5% of stars with planets for FGK stars), we find 
the best-fitting M dwarf planet fraction to be 1 .0%, with a 2a 
upper limit of 5.4%. 

The shape of the curve in Figure 17 is straightforward to un- 
derstand. In the period range P < 2000 days and mass range 
0.3-10 My, there are 35 detections out of 475 FGK stai's, a 
fraction of 7.4%. If the planet fraction around M dwarfs is 
r times the planet fraction around FGK dwarfs, we therefore 
expect to find 8 . 1 r detections amongst the 1 10 M dwarfs. The 
probability of detecting 1 planet is then (8 . 1 r) exp(-8 . 1 r) from 
the Poisson distribution. Using Bayes' theorem (e.g. Sivia 
1996), we can view this expression as the probability distribu- 
tion function for r given the measured number of detections. 
We plot this as the dotted curve in Figure 17. The fact that 
the dotted curve lies to the right of the maximum likelihood 
result indicates that the selection effects favor detection of gas 
giants around M dwarfs by about 25% (if we increase the ex- 
pected number by 25%, from 8.1r to lOr, the solid and dotted 
curves lie almost on top of each other). Although the Doppler 
errors and upper limits on velocity amplitude are generally 
greater for the M dwarfs, the lower stellar mass means that 
a gas giant planet intrinsically gives a larger velocity signal. 
The net effect is a slightly larger detectability of gas giants for 
M dwarfs. 

This result shows that the deficit of gas giants around M 
dwarfs is statistically significant in our sample, and is not due 
to selection effects against finding companions to M dwarfs. 
However, the best-fitting value of the ratio r is subject to small 
number statistics (Fig. 17). An illustration of this is the re- 
cently announced companion to GJ 849 (Butler et al. 2006b), 
which is one of the long period candidates shown in Figure 16. 
The data we analyse here do not include recent observations 
of this star taken in 2005 and 2006 that show the closure of the 
orbit. As a result, our search algorithm finds an orbital period 
of 4400 days, more than twice as long as the true orbital pe- 
riod of 1890 days. Therefore GJ 849 is actually just inside the 
region considered above {P < 2000 days), suggesting that the 
best current estimate for the M dwarf planet fraction is « 2% 
(Butler et al. 2006b) within 2000 days. The dashed curve in 
Figure 17 shows the result if we include the companion to 
GJ 849 in our calculation (by correcting the orbital period to 
the announced value by hand). Again, our result is well ap- 
proximated by a Poisson distribution (but this time with two 
detections) if the expected number is increased by 25%. The 
best fit relative fraction is r = 0. 19, which corresponds to 2.0% 
of M dwarfs with gas giants within 2000 days. The 2a confi- 
dence limit on r is r < 0.65. 
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Fig. 17. — Relative fraction of stars with a planet for stars with M« < 

0. 5 Mq compared with those with M* > 0.5 Mq. For each set of stars, we 
assume the mass-period distribution is dN = CM^P^dhiMdlnP with a = 
-0.31 and /3 = 0.26 (the best fit values for the whole sample, see Fig. 10) 
and plot the likeUhood of the ratio of the normalization factors r. The dotted 
curve shows the expected distribution if selection effects are not important, 

1. e. based on Poisson counting statistics only. The dashed curve shows the 
result if we include the recently annovmced companion to GJ 849. 



4. SUMMARY AND CONCLUSIONS 

We have carried out a systematic search for planets using 
precise radial velocity measurements of 585 stars from the 
Keck Planet Search. The number, duration, and frequency 
of observations, and typical Doppler measurement errors are 
summarized in Figures 2 and 3. This analysis provides a snap- 
shot of the Keck Planet Search at the time of the HIRES spec- 
trometer upgrade in 2004 August. 

We systematically searched for planets by calculating the 
false alarm probability associated with Keplerian orbit fits to 
the data for each star (C04; Marcy et al. 2005b; O'Toole et 
al. 2007). This method allows the detection threshold for each 
star to be understood in terms of the number and duration of 
the observations, and the underlying "noise" from measure- 
ment errors, intrinsic stellar jitter, or additional low mass plan- 
ets. The results are summarized in Figure 5. We find that all 
planets with orbital periods P < 2000 days, velocity amph- 
tudes A" > 20 m s~\ and eccentricities e < 0.6 have been an- 
nounced. For stars without a detection, upper limits (Fig. 7) 
are typically 10 m s~' for orbital periods less than the dura- 
tion of the observations, and increase a for longer periods 
(see C04 for a discussion of the period dependence of the de- 
tection threshold). The upper limits constrain the presence of 
additional planets, and allow us to study the mass and orbital 
period distribution. In section 3, we described a method to 
calculate the completeness corrections to the mass-orbital pe- 
riod distribution at low masses and long orbital periods. Our 
method is a generaUzation of the iterative method of Avni et 
al. (1980) to two dimensions. In the Appendix, we show that 
our approach corresponds to a maximum likelihood method 
with simple approximations for the likelihood functions of de- 
tections and non-detections. 

The resulting completeness corrections for the 475 F, G and 
K stars in the sample are summarized in Figure 9, and Ta- 
ble 1 gives the fraction of stars with a planet as a function 
of minimum mass and orbital period (see §2.1 for details of 



the stellar sample including the distribution of spectral types). 
For masses > 0.3 My, the detectability is good for periods as 
large as 2000 days. A power law fit to the data in this range 
gives a mass-period distribution dN = C M^P^dlnMdhiP 
with a = -0.31 ± 0.2 and /3 = 0.26 ±0.1. The normaUzation 
constant C is such that the fraction of FGK stars with a planet 
in the mass range 0.3-10 Mj and period range 2-2000 days 
is 10.5%. In units corresponding to measuring planet masses 
in Jupiter masses and orbital periods in days, the value of the 
normahzation is C = 1 .04 x 10"^. 

Table 2 shows the expected planet fractions obtained by ex- 
trapolating our results out to long periods. We estimate that 
17-19% of stars have a gas giant planet within 20 AU. Extrap- 
olating to low masses gives 11 % of stars with an Earth mass 
planet or larger within 1 AU. This extrapolation is uncertain, 
since it takes the distribution derived for gas giant planets into 
the mass range of rocky planets, for which the formation and 
migration history is presumably quite different, e.g. Ida & 
Lin (2004a). A similar uncertainty applies for our extrapola- 
tion to long orbital periods also, since for example the relevant 
timescales for planet formation grow longer at larger orbital 
radii, although outwards migration can populate long period 
orbits (Veras & Armitage 2004; Martin et al. 2007). 

We find several interesting features in the mass-period dis- 
tribution. Massive planets (> 2 Mj) are rare at short orbital 
periods, as has been noted previously. There is no significant 
evidence for a lower cutoff in the mass function at interme- 
diate orbital periods, down to planet masses of 0.1-0.2 Mj. 
Therefore we do not see any evidence yet for the planet desert 
proposed by Ida & Lin (2004a). For orbital periods longer 
than a year, there are almost as many detections in the mass 
range 0.3 < Mpsini < 1 (half a decade) as there are in the 
range 1 < Mpsin; < 10 (a full decade), suggesting a steeper 
fall off with mass than the overall mass distribution at long 
periods. However, this result depends on several candidate 
detections with < Mj in this period range that await confir- 
mation. A dependence of the mass function on orbital pe- 
riod might indicate differences in migration mechanisms for 
different planet masses (Armitage 2007). The orbital period 
distribution shows an increase in the occurrence rate of gas gi- 
ants of a factor of 5 beyond P « 300 days. Theoretical models 
of planet formation generally predict a smooth increase in the 
incidence of gas giants at longer orbital periods due to the in- 
creasing rate of migration as a planet moves inwards through 
the protoplanetary disk (e.g. Figure 1 of Ida & Lin 2004b; 
Figure 5 of Armitage 2007). The sharp increase in the period 
distribution at P w 300 days shown in Figure 1 1 may reflect 
some particular radial structure in the protoplanetary disk. Ida 
& Lin (2008b) propose an explanation for this upturn based 
on a surface density enhancement at the ice fine due to a local 
pressure maximum in the disk. Detailed comparisons between 
these various models and our results would constrain parame- 
ters such as the ratio of migration to disk depletion timescales 
(e.g. Ida & Lin 2004; Armitage 2007). 

Because of the small number of detections in the sample of 
110 M dwarfs, we are not able to constrain the mass-period 
distribution for these stars. However, by assuming that the 
mass-period distribution is the same for M dwarfs as for more 
massive stars, we constrained the occurrence rate of planets 
relative to the FGK stars, taking into account possible differ- 
ences in detectabiUty between the two groups. Our results 
shows that the occurrence rate of gas giants within 2000 days 
is r = 3-10 times smaller for M dwarfs than FGK dwarfs 
(Fig. 17), with a two sigma Umit r > 1.5. A lower inci- 
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dence of Jupiter mass planets around M dwarfs is predicted by 
core accretion models for planet formation (Laughlin, Boden- 
heimer, & Adams 2004; Ida & Lin 2005; Kornet & Wolf 2006; 
Kennedy & Kenyon 2008) . Comparing with Figure 8 of Ida & 
Lin (2005), we find that both the absolute and relative occur- 
rence rates that we derive for Jupiter mass planets agree best 
with their standard model, in which disk mass is an increasing 
function of stellar mass (cx Ml). Kennedy & Kenyon (2008) 
scale their disk mass oc M^,, but include a detailed calculation 
of the position of the snow line. Their Figure 7 shows that the 
probability of having at least one giant planet is 6 times lower 
for 0.4 Mq star than a 1 Mq star, within the range found in 
this paper. We can rule out a larger gas giant planet fraction 
for M dwarfs than for solar mass stars, as found by Komet 
& Wolf (2006) for models in which the disk parameters were 
independent of stellar mass. 

Our calculations can be improved in several respects. 
First, we neglected eccentricity when accounting for non- 
detections. This is reasonable for most values of e, since ec- 
centricity has a large effect on detectability for e > 0.6 (Endl 
et al. 2002; C04). However, the population of high eccen- 
tricity planets (e > 0.6) is not well constrained. In addition, 
there are more subtle selection effects involving eccentricity. 
For example, Gumming (2004) showed that non-zero eccen- 
tricity enhances detectability for orbital periods longer than 
the time baseline of the data, introducing a bias in the longest 
period orbits towards systems with e > 0. Further analysis 
is required to study the eccentricity distribution and the or- 
bital period distribution of long period planets. Our data po- 
tentially allow us constrain the distribution of orbital periods 
beyond the 8 year time baseline of the observations, but this 
will require averaging over the range of possible eccentricities 
for those outer planets. We did not include multiple planet 
systems. This introduces some uncertainty in our derived dis- 
tributions, since our technique identifies only a single planet. 
In a multiple system, this is the planet with the largest veloc- 
ity amplitude, so that the distributions derived here are for the 



most detectable planet in a system. Finally, our method for in- 
cluding the upper limits involves dividing the data into either 
detections or non-detections, which depends on the choice of 
detection threshold. A better approach would be to calculate 
the probabihties directly and include them in the analysis (see 
eq. [Al]). Techniques to evaluate the relevant Bayesian in- 
tegrals over the multidimensional parameter space have been 
discussed in the literature (Ford 2006; Ford & Gregory 2007; 
Gregory 2007a, 2007b). This approach will allow orbital ec- 
centricity, the uncertainty in parameters associated with long 
orbital periods, and multiple companions to be included. 
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APPENDIX 

CORRECTING FOR THE UPPER LIMITS: A MAXIMUM LIKELIHOOD APPROACH 

Likelihood function for detections and non- detections 

In this Appendix, we derive equation (9) for the completeness corrections starting with a maximum likelihood approach. 
We assume that the fraction of stars with a planet is Fp, with a mass-period distribution f(M,P) normalized so that 
/ d]nM J d]nP f{M,P) = Fp. The UkeUhood of the data for star i is 

Li = (1 -Fp)qi + j dlnM j d\nP f{M,P)pi{M,P), (Al) 

where pi(M,P) is the probability of the data given a planet of mass M and period P, and q, is the probability of the data given 
that no planet is present (see eqs. [10] and [11] of C04). To determine f(M,P), we maximize the total likelihood, which is the 
product over all stars of the individual likelihoods, L = YiiLi. 

Note that equation (Al) for the likelihood assumes that each star has either no planet or one planet, i.e. that the presence of a 
planet with mass M and period P excludes the possibiUty of additional planets with other masses and periods. This is consistent 
with our search algorithm described in §2.3 which fits a single Keplerian orbit, and for multiple planet systems identifies only 
the planet with the largest radial velocity amplitude. The distribution f{M,P) that we derive should therefore be considered as 
the mass-period distribution of the most detectable planet in a system. This is a close approximation to the true distribution since 
the number of multiple planet systems is ~ 10% of the total (e.g. Marcy et al. 2005a). To include the possibihty of multiple 
planets, the likelihood function can be derived by considering each bin in mass-period space as an independent Poisson trial (e.g., 
as Tabachnik & Tremaine 2002), however this reduces to equation (Al) when the expected number of planets per star is ^ 1 
(compare eq. [10] of Tabachnik & Tremaine 2002 with eq. [A5] below; see also Appendix B of Tokovinin et al. 2006 for a clear 
discussion). 

In this paper, rather than evaluate and ^, directly, we have classified each data set as either a detection or a non-detection. In 
the case of a detection, we make the approximation 

qi « 0, Pi(M,P) « MiPiSiM-MdS(P-Pd, (A2) 
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since we expect the likelihood to be strongly peaked near the best fit mass and period for a strong detection, with a vanishing 
probability that no planet is present. We write the mass, period, and velocity amplitude of detection i as M„ P,, and A",. For a 
non-detection, we expect 



(A3) 



(A4) 



since we are able to rule out velocity amplitudes above the upper limit /Tup but not below it, and it remains possible that the star 
has no planet. Substituting these approximations into equation (Al) gives 

r f(Mi,Pi) detection 
^' <^ 1 1 - J^^^^ . d InM J InP /(M, P) non-detection 

where we have used the normalization J dlnM J JlnP f{M,P) = Fp. The total likelihood is 

L = ^f{Mi,Pi) n d\nMd\nP f{M,P)\, (A5) 



1=1 



7=1 



where Nd is the number of detections out of A^* stars. Equation (A5) is a two-dimensional generalization of the likelihood function 

of Avni et al. (1980). 

As a quick aside to gain some intuition, let's assume that /(M,P) is a constant independent of M and P. In addition, assume 



that A^up is the same for each star with a non-detection. Then, 



Lo^Fp" (l-kFp) 



N,-Nj 



(A6) 



where k is the fraction of planets ruled out by the upper limit. To find the best fit value of Fp, we maximize L by setting 
dL/dFp = 0. This gives 

Fp = ff'- (A7) 

If the upper limit rules out the whole mass-period plane (in other words we can say that a star without a detection definitely does 
not have a planet) then k=l and Fp = Na /A^*. However, if A: < 1 then we can exclude only a fraction k of planets, and our estimate 
of Fp must therefore be larger. For example, if a third of planets He above the upper limit ik=l /3), we conclude that there are 
three times as many planets as we actually detect, Fp = SNa/Ni,. 

Maximizing the likelihood: non-parametric approach 

To proceed further, one could divide the mass-period plane into a grid and solve for f(M,P) in each grid cell (a non-parametric 
approach), or assume a parametric form for f{M,P) and find the parameters that maximize the likelihood. We follow Avni et 
al. (1980, §Vb) which is to discretize / at the locations of the detected planets, i.e. write J dlnM dlnP f(M,P) = J^fi' where we 
use the shorthand ft = f{Mi,Pi), the sum is over the detections, and the normalization is = Fp. This method gives f{M,P) in 
a non-parametric way, but without binning the distribution. The total log likelihood is 



logL = ^log/; -I- ^log 



i,Ki>Kapj 



(AS) 



where the sums with index i are over detections and the sum with index j is over all non-detections. 
We can now go ahead and maximize L with respect to the Na values of We set dL/ dfi = 0, which gives 



j,Kupj<Ki 



1 



(A9) 



where the sum with index j is over all non-detections with upper limits that exclude a companion with the amplitude of detection 
i, and we define 

z(/:up)=i- E (^^0) 

i,Ki>K^ 

which has a sum over all detections that have velocity amplitudes larger than the specified upper limit K^p. 

Equation (A9) is an equation for fi which can be solved iteratively, as in §3.1. However, we proceed a little further in order to 
make connection with the heuristic derivation given in §3.1. We write equation (A9) as 

and then sum both sides over the detections, from i = I to i = Nd. After changing the order of the sums on the right hand side and 
simplifying, we find the result 
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where the sum j is over all non-detections. Using this to rewrite equation (A9), we find 

fi 



(A13) 



where the sum is over all non-detections j which have upper limits that allow a companion with the same amplitude as detection i 
to be present. Equation (A13) is therefore equivalent to equation (A9), and with the final definition Nj = N^^fi, reduces to equation 
(9) of §3.1 when solved iteratively. 

The limit of small bin size 

One might worry that by following the distribution f{M,P) only at the location of the detected planets, we are missing those 
areas of the mass-period plane in which there are no detections. In fact, we show now that the converged solution has non-zero 
values for / only at the locations of the detected planets. We start with equation (A5) and divide the mass-period plane into grid 
cells, labelled by a so that we will write f{M,P) averaged over grid cell a as f(a). The grid cell containing detection i we write 
as a,. The hkehhood is then given by 



logL=^log/(ai) + ^log 



■ E 

a,K(a»K^pj 



f(a) 



(A14) 



In the last term, K(a) is the velocity amplitude associated with grid cell a, and so the sum is over only those grid cells that are 
constrained by the upper hmit j. We assume that the grid cells are small enough that the entire grid cell is either excluded or 
allowed by the upper limit; in practice only a part of the grid cell may be excluded by the upper limit, but we ignore this here for 
clarity. Now, maximizing logL with respect to the set of f(a) by setting dL/df{a) = 0, we find 



E 



1 



(A15) 



where Njia) is the number of detections in grid cell a and Z(K^^) = 1 — 'YliK(a)>K /C"^)- Following the same steps as in our 
derivation of equation (A13), we rewrite this as an iterative procedure. 



ma) 



E 



i,K^j>K(a) 



f"(a) 1 



which should be compared with equation (A13). Now subtracting /"(a) from both sides, we find 



1 



J,K^s,,j<K(a) 



(A16) 



(A17) 



where we use a result analogous to equation (A12). We see that the converged solution (/' 

-1 



■n+l 



= /'')is 



fia)=Npia) 



1 



J,Kupj<K{: 



a) 



(A18) 



Therefore, the converged solution has f{a) non-zero only in those grid cells which have detections (Ndia) > 0). Taking bins 
small enough to contain either one or zero planets, we see that we need evaluate / only at the locations of the detected planets, 
which is our starting point for equation (AS) (see also §V.b of Avni et al. 1980, who show that a solution of eq. [A5] can be 
derived which is a sum over delta functions evaluated at the detected points). 

Likelihood function for power law mass and period distributions 

We now derive the likelihood L for an assumed power law distribution dN = f(M,P)d\nMd\nP = C M^P^dlnMdlnP, where 
C is a normahzation constant. This is used in §3 to derive the best-fitting values of a and /? by maximizing L with respect to 
these parameters. The calculation here is very similar to Tabachnik & Tremaine (2002) except for our different form for L (see 
discussion following eq. [Al]). The log hkehhood is 



logL = J2 (alogM,- + /31ogP,- +logC) + ^log [l -C Ij (a, /3)] 



(A19) 



where 



Setting dL/dC = gives the relation 



dlnMdlnPM^P^. 



K~>Km 



^''=Et 



(A20) 



(A21) 
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which can be solved to determine C for given a and /3. We calculate the integrals Ijia.jS) numerically, taking into account the 
variation of the upper limit K^p with orbital period. Following Tabachnik & Tremaine (2002), after choosing the range of InP and 
InM we are interested in, we refine it to just cover the range of periods and masses of the detected planets, as this maximizes the 
likelihood. 

In §3.4, we divide the stars into two groups based on their mass, and, assuming fixed values of a and /?, fit separately for the 
two normalizations Ci = fC and C2 = C, where / is the relative planet fraction of group 1 (stellar masses < 0.5 Mq) compared 
with group 2 (stellar masses > 0.5 Mq). It is straightforward to show that in this case, C and / are determined by solving 



■' 7(2) 



where j( 1 ) or j(2) indicates a sum over the non-detections in group 1 or 2 respectively, and Ni and N2 are the number of detections 

in groups 1 and 2. 

To get a feel for the solution, it is helpful to solve equation (A21) for the constant C in the approximation that the integrals Ij 
are the same constant for all stars with non-detections. This gives 

m,P)d^nMdlnP= (^] M-^'d^^dlnP 

\nJ J^^^^^M"P0 dlnM dlnP ^ ' 

The last term counts the number of constraining observations, analogous to the factor k in equation (A7). In the case where we 
divide the stars into two groups, the ratio of normalizations that maximizes the likelihood is 

^ = ^^. (A24) 

The factor h/h accounts for the relative detectability of planets in group 1 or 2. If detectability is the same for the two groups of 
stars, h = h, the estimate for C1/C2 corresponds to simply counting the relative fractions of detections. 
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